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Abstract 



p<| We show that adaptive time stepping in particle accelerator simulation is an enhancement for certain problems. The new algorithm 
has been implemented in the OPAL (Object Oriented Parallel Accelerator Library) framework. The idea is to adjust the frequency 
of costly self field calculations, which are needed to model Coulomb interaction (space charge) effects. In analogy to a Kepler 
orbit simulation that requires a higher time step resolution at the close encounter, we propose to choose the time step based on 
the magnitude of the space charge forces. Inspired by geometric integration techniques, our algorithm chooses the time step 
proportional to a function of the current phase space state instead of calculating a local error estimate like a conventional adaptive 
^— h procedure. Building on recent work, a more profound argument is given on how exactly the time step should be chosen. An 
intermediate algorithm, initially built to allow a clearer analysis by introducing separate time steps for external field and self field 
integration, turned out to be useful by its own, for a large class of problems. 

i Keywords: Adaptive time stepping, Boris-Buneman, particle-in-cell, space charge, particle accelerator simulation 
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In recent years, precise beam dynamics simulations in the 
design of high-current low-energy hadron machines as well as 
of 4th generation light sources have become a very important 
research topic. Hadron machines are characterized by high cur- 
rents and hence require excellent control of beam losses and/or 
keeping the emittance (a measure of the phase space) of the 
beam in narrow ranges. This is a challenging problem which re- 
quires the accurate modeling of the dynamics of a large ensem- 
ble of macro or real particles subject to complicated external 
focusing, accelerating and wake-fields, as well as the self-fields 
caused by the Coulomb interaction of the particles. 

The simulation method discussed in this paper is part of a 
general accelerator modeling tool, OPAL (Object Oriented Par- 
allel Accelerator Library) [ 1 1. OPAL allows to tackle the most 
challenging problems in the field of high precision particle ac- 
celerator modeling. These include the simulation of high power 
hadron accelerators and of next generation light sources. Re- 
cent physics proposals include J2H4]]> a ll of them require large 
scale particles based simulation in order to design and optimize 
the required high power hadron machines. 

Here, we discuss methods which track the orbit of each par- 
ticle individually, with time as the independent variable. Accu- 
rate modeling demands the usage of many simulation particles, 
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which of course makes this approach expensive. The flow of a 
particle is described as an initial value problem for a differential 
equation, therefore such methods are called (time) integrators. 
What is common to them is that they create a discrete trajec- 
tory that approximates the solution of the initial value problem. 
How they transport a given state from time t„ to t n+ \ is crucial 
for accuracy and computational effort. 

Currently, only time integrators that use constant time steps 
Af = 

fn+i — t n are utilized in OPAL. The goal of the recent mas- 
ter thesis (51 was to investigate whether an integrator that uses 
variable time steps can provide enhanced efficiency. Since the 
space charge solver, which computes the self field induced by 
the charge, requires all particles to be synchronized in time, 
there is a global time step shared among all the particles. Only 
this global time step is to be adapted. Other kinds of adaptation, 
like adaptive mesh refinement, are of separate concern and not 
the subject of this work. 

In lO two categories of adaptive time stepping schemes 
were identified: 

(a) conventional adaptivity that modifies the time step based 
on a local error estimate, and 

(b) geometric adaptive integration which aims at solving a 
regularized differential equation. 

Both categories adapt the time step based on the local dynam- 
ics. However, the way how this is achieved differs. In (a), the 
step size is changed such that a predicted local error is below 
some target value, assuming that the local error is roughly pro- 
portional to some power of the step size. The local error es- 
timate is usually obtained by comparing two local solutions of 
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different order, like, e.g., in Runge-Kutta schemes with step size 
control |6|. Typical for this kind of algorithms is that a step can 
be rejected and repeated if the new local error estimate is above 
the tolerance. On the other hand, the origin of (b), geometric 
adaptive integration, are the deficiencies of (a) when applied to 
systems with special structure, e.g. Hamiltonian systems. Inte- 
grators, made adaptive in the conventional way, lose their re- 
versibility or symplecticity properties. 

First, it was not clear, how to obtain an appropriate and 
cheap-to-compute local error estimate in our particle-in-cell ap- 
plication for approach (a), and the possibility that steps have to 
be rejected and repeated indicated that the memory consump- 
tion may increase. The reason why we considered approach 
(b) was the hope that it would present a light-weight alterna- 
tive with less overhead. Strict time-reversibility of the adaptive 
scheme was not a major issue. 

A Kepler orbit problem was chosen as a model problem of 
adaptive geometric integration. A leapfrog method, adaptive in 
the sense of (b), was used to perform smaller steps when the 
comet is near the sun and attracted by large forces, and larger 
steps when the comet is far away. Following this model, we de- 
rived an adaptive variant of the Boris-Buneman method. Where 
in the Kepler case we adapted the time step to the strength of 
the gravity force, in our application it is adapted to the strength 
of space charge forces. In both cases, the strength of the force 
is proportional to the square of the distance, just the sign and 
the coupling constant is different for the Coulomb force. 

A first implementation proved this type of adaptive time 
stepping to be beneficial for particle accelerator simulations, 
see the results on two photoinjector scenarios in [5 |. However, 
a clear argument was missing on how the time step should be 
chosen for optimal results. Moreover, time step requirements 
for external fields were not considered. In this follow-up work, 
based on a ID model problem, we give a sound motivation for 
the used adaptation scheme. The enhanced version of the adap- 
tive algorithm still chooses the time step based on space charge 
forces only, but allows finer inner time steps in a multiple-time- 
stepping fashion to account for fast varying external fields. 

The initial hope was that we can show the usefulness of 
variable time steps for many other scenarios too. However, the 
benefit over constant steps with our algorithm becomes visible 
only when variations in space charge forces are large. This is 
the case for space charge-dominated scenarios like a photoin- 
jector simulation, but e.g. not for simulations of high-energy 
cyclotron turns. For cases where variations in the space charge 
forces are small but space charge still has a visible effect, the 
multiple-time-stepping extension alone proved to be an impor- 
tant enhancement to the original Boris-Buneman method. Al- 
though it was thought to be only an intermediate algorithm and 
its derivation was relatively easy compared to the rest, this part 
is probably the achievement with broader practical impact. 

In Section [2] we state the problem. In Section [3] follows an 
introduction to the current time integration scheme available in 
OPAL. The new adaptive Boris-Buneman scheme is presented 
in Section [4] building on an multiple-time-stepping extension 
of the original method. In Section [5] we examine more closely 
the adaptive step size strategy by looking at a one-dimensional 



model problem. In Section [6] we benchmark the developed 
methods on applications from beam dynamics, and in Section|7] 
we draw our conclusions. 



2. Particle beam simulations with space charge 

2.1. The electrostatic particle-in-cell method 

We consider the Vlasov-Poisson description of the phase 
space, including external and self fields. Let f(x, v, t) be the 
density of the particles in the phase space, i.e. the position- 
velocity (x, v) space. Its evolution is determined by the colli- 
sionless Vlasov equation, 

d -L = d,f + v ■ V x / + i-(e + v x b) • V v / = 0, 
at m 

where m, q denote particle mass and charge, respectively. The 
electric and magnetic fields e and b are superpositions of exter- 
nal fields and self fields (space charge), 



e = e ext + e self , 



b = b ext + b self . 



(1) 



If e and b are known, then each particle can be propagated ac- 
cording to the equation of motion for charged particles in an 
electromagnetic field. After particles have moved, we have to 
update e selt and b self (among other things). For this, we change 
the coordinate system into the one moving with the particles. 
By means of the appropriate Lorentz transformation X 10 we 
arrive at a (quasi-) static approximation of the system in which 
the transformed magnetic field becomes negligible, b ~ 0. The 
transformed electric field is then obtained from 



e = e 



self 



= -V0, 



(2) 



where the electrostatic potential 4> is the solution of the Poisson 
problem 

X(p(x)) 



- A0(x) = 



£0 



(3) 



equipped with appropriate boundary conditions. Here, p de- 
notes the spatial charge density and £0 is the dielectric constant. 
By means of the inverse Lorentz transformation ( £.~ l ) the elec- 
tric field e can then be transformed back to yield both the elec- 
tric and the magnetic fields in ([T}. 

The Poisson problem |3]l discretized by finite differences 
can efficiently be solved on a rectangular grid by a Particle- 
In-Cell (PIC) approach [8]. The right hand side of <|3j is dis- 
cretized by sampling the particles at the grid points. In (|2]), is 
interpolated at the particle positions from its values at the grid 
points. We also note that the FFT-based Poisson solvers and 
similar approaches [ 8 9 1 are most effective in box-shaped or 
open domains. 

2.2. Equations of motion 

We integrate in time N identical particles, all having the rest 
mass m and charge q. The relativistic equations of motion for 



2 



particle i are 



where 



~d7 
dp, 
At 



Pi 

mji 



= 9 e* + 



x b, 



(4) 
(5) 



where x, is the position, p,- = mv,y; the relativistic momentum, 
Vi the velocity, y t = 1/y/l - (\\\i\\/c) 2 = + (||p,-||/(mc)) 2 the 
Lorentz factor and c the speed of light. The electric and mag- 
netic field, e, and b,-, can be decomposed into external field and 
self field contributions: 



e, = e ext (x,-, t) + e selt (/, x L jv, Pl.jv), 



b, = b ex W) + b se ^,xi..j V ,Pi.jv) 



(6) 
(7) 



The notation Xj.jv is a shorthand forxj,... , X/v, and is used for 
other vectors analogously. The self field describes the field cre- 
ated by the collection of particles i.e. the source of the Coulomb 
repulsion. The external electromagnetic field (from magnets 
etc.), which can have an explicit dependence on time f, are in 
this model treated independent of the other particles. 

3. Constant time step integration method 

The Boris-Buneman integration scheme IflOl pp. 58-63] 
solves the single particle equations of motion in electric and 
magnetic fields, and is a widely used orbit integrator in explicit 
PIC simulations and codes such as OPAL. The scheme is pop- 
ular because it is simple to implement and because it gives sec- 
ond order accuracy while requiring only one force evaluation 
per step. 

The integration method is similar to the leapfrog method 
(see e.g. [5 1) as it offsets position and momentum by half a time 
step and updates them alternatingly. However, we have to deal 
with a non-separable system as the momentum derivative de- 
pends on the momentum itself in the magnetic force term. This 
requires special treatment to retain good properties like time- 
symmetry. 

The classic derivation by Buneman and Boris is explained 
in [5 1 for the nonrelativistic case. The relativistic generalization 
is not difficult [10 pp. 356-357]. Here, we develop the method 
by an operator splitting approach fTTJ sec. II.5], directly work- 
ing on the relativistic equations of motion. We will see that this 
yields exactly the same method. While this alternative deriva- 
tion itself is not strictly necessary for this paper, it serves to 
introduce the operator-notation used to characterize the new in- 
tegrator. In the operator-notation we make a simplifying as- 
sumption about the fields and look at only one particle, in order 
to allow the reader to quickly grasp the ideas. In the algorithms 
we will give all details how the integrators are implemented in 
our specific application. 

We now look at a single particle and assume that the fields 
depend only on position x, i.e. do not depend on p and are con- 
stant over time. Let the equations of motion be written as 



d 

d7 



fx(p) 






f £ (x) 





ffl(x,p) 



(8) 



fx(p) = p/(my(p)), 
f £ (x) = qe(x), 
f B (x,p) = qpx b(x)/(my(p)). 

For simplified systems where only one term of the right hand 
side of <(8j exists we can make the following statements: 

• If only fx was present in the RHS, then 

Drift, ■' x \,Jx + «*(P) 

would be the flow of the system, i.e. would integrate the 
system exactly. 

If only f E was present in the RHS, then 

E-Kick/, : I X I i-> ( , , - 
\ p / \ P + hf E (x) 

would be the flow of the system. 

If only fs was present in the RHS, then 



B-Kick,, 



p + M B (x,p) 



would be a numerical first-order approximation to the 
flow of the system (forward Euler method). 

Having identified these parts, we construct a second-order in- 
tegrator by using a suitably ordered composition of these map- 
pings. By two nested applications of equation (5.9) IfTTI p. 49] 
(Combining Exact and Numerical Flows), 

O/, = B-Kick/, o E-Kick A o Drift/, 

is a first-order integrator for the combined system. By com- 
posing the operator <t>/,/2 with its adjoint <t>^ 2 a second order 
integrator is obtained IfTTI p. 45], 

BB/, = 0>* /2 o <t>/, /2 = ((D-A/ar 1 o % /2 
= Drift/,/2 o 

E-Kick/,/2 o (B-Kick_/,/2)~' ° B-Kick/,/2 ° E-Kick/,/2 



Kick/, 



o Drift, 



h/2, 



where the second equality recalls the definition of the adjoint. 
This is already the Boris-Buneman-method. However, the mag- 
netic field kick in the middle 



*** = (B-Kick^)- 1 o B-Kick, l/2 I* 
p** / \ P* 



is an implicit mapping. The p-component is 

p„* = p* + hqp* x b(x„)/(2my(p»))+ 
hqp»» x b(x«)/(2my(p«)), 



which is recognizable as the trapezoidal integration rule. Using 
that x» = x« and that the length of the momentum vector is 
invariant in the B-field rotation (y stays constant), we have 

p** p* p* ~*~ p** ^ \)( ) 
h ^ 2my(p t ) 

Boris' important contribution is the way how p*» can be calcu- 
lated explicitly, thus making the whole method explicit. 

In our application, the fields do not only depend on the po- 
sition. But the change with respect to momentum and time 
are typically small within a time step, so the properties of the 
method still hold approximatively (assuming the time steps are 
not chosen too large). The method is implemented in OPAL as 
described in Algorithms 1, 2, and 3. 



Algorithm 1 BB(x L .jy, p L . jy, f end , h) 

f <- 

while t < fend do 

(Xi.,jvr,0 <- Drift (ft/2, Xi..jvt,Pi...jv,0 
(ei.,jv, bi..jv) <— ExternalFields (Xi,jv> + 

SelfField (Xi..jv, Pl.jv) 
Pl.jv *~ Kick(ft,pi.., A r,ei.. JV ,bi,., JV ) 
(xi...jvt,0 <- Drift (/i/2,x l .jv, Pl.jv, 

end while 



Algorithm 2 Drift(/i, Xl.jv, Pl.jv, f) 
for i = 1 t o N do 

r «- +p7Pi/(mc) 2 

x,- <- x,- + hpi/imy) 
end for 
t *- t + h 
return (xl.jv, t) 



Algorithm 3 Kick(/i, pi.../v, ei.../v, bi..jv) 
for i = 1 to N do 
Pi <- p, + hqd/2 

y ^1 +pjpi/(mc) z 

r <— hqbi/(2my) 

w <- p, + p, x r 

s <- 2r/(l +r T r) 

Pi <— p, + W X s 

Pi <- p, + hqd/2 
end for 
return Pl.jv 



4. Adaptive time step integration method 

4.1. Multiple-time-stepping (MTS) variant 

Before discussing our adaptive step size variant of the Bo- 
ris-Buneman integrator, we discuss a simple but powerful ex- 
tension to the fixed time integration scheme from chapter 3. We 



are following [11, Chapter VIII 4.1] and write the differential 
equation with a fast-slow splitting as 

— = f ext + f self . (9) 
df 

jseif correS p 0nc j s to the slow dynamics and is also the most ex- 
pensive term to evaluate, namely the space charge forces. The 
fast dynamics is governed by f ext , and in our particular example, 
this term arises from fast varying external fields in an acceler- 
ator. For details see Algorithm 4.1 and Lemma 4.2 of ifTTl sec. 
VIII.4]. We state a multiple-time-stepping variant of the Boris- 
Buneman integrator as 

MTS"' = Kick$ o (BB- )'" o Kick$ (10) 

where the m > 1 substeps are defined by 

BB^ xt = Dri% 2 o Kickf 1 o Drift /l/2 . (11) 

The superscripts "ext" and "self" of the Kick-operator denote 
that only external and self-field forces, respectively, are used 
for the momentum update. The pseudo-code of the method is 
shown in Algorithm [4j which refers to Algorithm [5] and to al- 
ready presented Algorithms 1, 2 and 3. 



Algorithm 4 MTS(xi„jv, pi.jv, fend, h, m) 

(ef^bfy «- SelfField (x 1 ...^,p 1 ..^) 
t<-0 

while t < fend do 

Pl.jv <- Kick (h/2, p u jv.e£$,,b«*J 
for i = 1 , . . . , m do 

(xi..jv, Pl.jv, <- BB ext (h/m,x hM ,p K , N ,i) 
end for 

to> b i e y «" SelfField (xi...Af,pi..jv) 

Pl.jv «- Kick (h/2, Pl.jv, e^, bf^) 
end while 



Algorithm 5 BB ext (/i,x L .jv, Pl.jv, f) 

(Xi.jv.O *~ Drift (/i/2,x l .jv, Pl.jv, 
(e e * l N , bj* 1 ^) <— ExternalFields (Xi.,jv, 

Pi..jv <- Kick(ft,pi... 2V ,ef x, Ar ,b^ JV ) 
(xi..jv,0 <- Drift (A/2,Xi.jv,pi..jr,0 
return (Si..jy, Pi_jv> 



4.2. Adaptive step size variant 

We explained in the introduction why geometric adaptive 
integration was used as the starting point in the design of our 
algorithm. Briefly, the simplicity of choosing the time step pro- 
portional to the system's state without having to deal with error 
estimates and step rejection was compelling, and the results of 
the previous work [5| motivated to further investigate this ap- 
proach. 

The concept of adaptive geometric integration is to look at 
a transformed system that evolves in a fictitious time r. The 



4 



relation that connects the real time to it is called a Sundman 
transformation 



dt 
dr 



= «<*)• 



(12) 



The function g determines the time rescaling and depends on 
the state of the system. It has to be chosen problem-dependent. 
The idea is that the rescaled system can be integrated with con- 
stant steps because the dynamics are regularized. When g is 
large, ?(r) increases rapidly and we take large steps in real time. 
When g is small, f(r) increases slowly and we take small steps. 
Where the evolution of the system over t was described with 



dz 
df 



f(z,f), 



(13) 



the transformed equation with independent variable t becomes, 
by the chain rule of differentiation, 



dz 
dr 



dz dt 
dt dr 



= f(z,f)g(z). 



(14) 



The curve z(t) is identical to z(f), but the trajectories are tra- 
versed at different velocities with respect to the independent 
variable. Constant steps taken in the fictitious time r correspond 
to variable steps in the real time t, excluding the uninteresting 
case of a constant g. Multiple methods exist to solve such a 
transformed system. 

Preserving time symmetry in the construction of the adap- 
tive integration scheme is an important point for some applica- 
tions, especially in few body settings like the Kepler problem. 
In we noted that this symmetry is not important for our PIC 
application, as a consequence we do not put effort in enforcing 
this symmetry, favoring simplicity. The core concept we take 
over from the adaptive geometric integration technique is the 
choice of the time step proportional to some - cheap to evaluate 
- function of the current state. 

How should we choose the function g for our PIC applica- 
tion? Our inspiration came from the treatment of the Kepler 
problem. There the adaptive geometric integration technique 
demands smaller step sizes in situations where the force is large, 
i.e. when the two bodies are close to each other. In our case, we 
similarly want to demand smaller step sizes when particles are 
near to each other, i.e. when the beam volume is small and the 
space charge forces are large. Of course there is still a vari- 
ety of ways how this can be done, especially because we have 
not only two bodies but a collection of particles. The key point 
here is: this approach to step size adaption is based solely on 
the strength of the space charge force, and not on the properties 
of the external fields. While this sounds limiting, the regular- 
ization of space charge solve frequency was the most evident 
opportunity to save computation cost in our application. It is 
certainly possible to extend the method with some sort of adap- 
tivity for external fields, if required. 

In order to determine g, for every particle, we compute its 
acceleration due to the space charge field with 



a, = 



d\ 
df 5 " 



1 



pself „ 



1 



m 2 c 2y2 



-Tiself 
P, f , 



(15) 



where f self is the right hand side of Q using only the self field 
contribution. The acceleration of largest magnitude among all 
particles is now related to the value of g as 



(pi..jv,fi elf A ,) oc (max||a,||) 



-m 



(16) 



Because only proportionality is important, one can leave out 
constants in the calculation of g. Section [5] will give an ar- 
gument why /3 — 1.0 is a good choice by analyzing adaptive 
integration of a one-dimensional model problem. In our exper- 
iments the overhead for calculating g was negligible compared 
to the total time spent, but one can easily make this cheaper if 
required, e.g. by dropping the second term of the parenthesized 
part in ( fT~5] > or by using a mean y, avoiding per-particle square 
root operations. 

In 0, we used a simpler version of g which was based 
on the beam size directly. Also, there was only one time step. 
Here, we combine separate time steps for external and self fields 
together with variable step sizes for self fields. The MTS algo- 
rithm (see Alg.|4]i is made adaptive ("AMTS") by 

• using a variable outer time step h adapted to space charge 
strength, and, 

• using a variable to e N such that the inner time step h/m 
for external field integration is kept roughly at same size 
if possible. 

The inner time step, of course, can never be higher than the 
outer time step. But as soon the outer time step exceeds some 
value, to is increased to keep the external field step roughly at its 
original size. See Algorithm [6] for the full method. The initial 
step sizes must be specified as input. 



Algorithm 6 AMTS(Xi.jv, Pi.jv, t end , Af ( mit 
(eSv-Mfiv) «" SelfField(xi.. JV> p 1 ... /v ) 
*<-*(pi...*.e&.b^) 
At AQl/A 
t^O 

while t < f end do 



ouier' dinner 



A ■ At 

- [/i/Afinner] 



h « 
to 

P1.JV <- Kick (h/2, Pi.jv.e^.bf^) 
for i - 1 , . . . , m do 

(xl.jv, Pi.jv, O <- BB ext (h/m, Xi... n ,Pl..n, 
end for 

( e Sv> b i e y «" SelfField(x 1 .. JV ,p 1 .. JV ) 
pi.jv <- Kick (h/2, Pi.jv, e^bf^) 



end while 



5. Inverse square force integration 

5.1. Model problem 

To better understand which adaptive strategy is optimal, a 
simple model problem is considered. We look at the motion 



5 



of a single particle in one dimension, under the influence of a 
repulsive, Coulomb like force. For position x and velocity v of 
a single body, let the equations of motion be 



dx 
~dt 
dv 
dr 



= v, 



(17) 
(18) 



The force is proportional to the inverse square of the distance to 
the origin. Figure [T] shows the solution for the position and ve- 
locity as function of time. We can already 'guess' that a smaller 




Figure 1: Solution of the ID hard soft-wall collision problem. Shown are posi- 
tion x(t) and velocity v(f) as function of time. 

time step near the minimum of x (closest encounter) should be 
used. But we don't know how exactly the step size should de- 
pend on the situation. Therefore we investigate how the choice 
of the time rescaling function g (from some class of candidates) 
influences the error of adaptive integration methods. The accu- 
racy is measured by comparing to an analytical solution. 

5.2. Analytical solution 

This specific case allows an analytic treatment. The solution 
will not be explicit, but can be calculated by finding a root of 
some function. 

Given the initial conditions Xo > and Vo, we determine 
xl = x(ti) as the minimum of the curve x(t). The energy 
H(x,v) = v 2 /2 + l/x stays constant along the solution, and 
knowing v(ti) = leads to 



xl = 



2x 



2 + vIxq 



We can give the velocity (only positive solution used here) for 
any x > x^ by using the energy argument again: 



V(x) 



Rearranging ( [18) using dv/df = dv/dx ■ dx/dt = dv/dx ■ v lets 
us integrate on both sides: 




Jxi Ax JO 



Evaluating and simplifying gives us the (positive) time for the 
motion from x^ up to x with 



T(x) = ^ ^x(x - x L ) + x L log I 



The final solution is found in two steps. First, we compute the 
relative time needed to find the minimal x, 



tL 



-T(x L ), ifv >0, 
T(xl), otherwise, 



then we compute the final condition from 

( T -\ 



x(t) = 



v(r) = 



T-Hh 



(t - f L ), if t > t L , 

l - t), otherwise, 

V{x(t)\ if t>t L , 
-V(x(t)), otherwise. 



To compute T 1 (f), we find the root of T(x) - t by bisecting in 
the interval [x L ,x L + f 2 /x L J. 

5.3. Experiment and findings 

We already know that the numerical integration efficiency 
can be increased by concentrating effort on parts where the 
force is large, see e.g. the Kepler problem in [5|. The con- 
sidered adaptive methods employ a Sundman transformation of 
the form 

a. ia2 \-PI 2 

%->>*-*■$) 

applied to the system ([17} - ([18). In other words, we aim at 
making the variable timestep proportional to g. Of course there 
are other possibilities how to choose the transformation, but the 
dependeny on solely x means that the timestep will be a func- 
tion of the magnitude of the acceleration/force. For problems 
with many bodies (e.g. our PIC simulation), we then can use the 
maximal acceleration/force for the choice of a global timestep. 
The goal is now to look more closely at the exponent ft and ask 
for an optimal value. 

In the following, we integrate three different scenarios until 
fend = 20. The initial conditions are chosen such that the lowest 
point is reached at <l = 10, i.e. xq = r _1 (10) and vo = — V(xq). 
The scenarios differ in the energy level H. A higher energy 
level means the magnitude of the initial velocity is larger and 
the minima appear at a lower x^. For example, the energy level 
H — 10° corresponds to Fig. [I] where xl = 1. The adaptive 
Verlet method (see references in Q) is used for integration. 
Since there are multiple variants of this method that differ in 
operator splitting and timestep adaption, we give Algorithm[7]to 
remove ambiguity and to allow easy verification of the results. 
It can be expected that the essence of the result (what is optimal 
jS) also holds for other similar adaptive methods. The initial 
timestep was chosen such that each run reached f enc j in 1000 
steps. 

Figure [2] shows that exponent /3 — 1 is a good choice, where 
the difference to other exponents gets pronounced at higher en- 
ergy levels. We also experimented with other variants of the 
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Algorithm 7 Adaptive VerletSimulation(xo, vn, / en d, g, Afo) 



1 


A ±- g(x {) ) 


2 


At <- At /A 


3 


(x, v, t) <- Oo, v'o, 0) 


4 


while ? < fend do 


5 


v <- v + AtvI/2 ■ 1 /x 2 


6 


x <— x + At/1/2 • v 


7 


t <^ t + AtA/2 


8 


A*-t+l/(2/g(x)-l/A) 


9 


x <— x + At/1/2 ■ v 


10 


v <- v + Ari/2 • 1 /x 2 


11 


f <- f + Ar/l/2 


12 


end while 



adaptive Verlet method and a symplectic adaptive method (see 
p. 23]). They all showed similar behaviour. 



0.1 




1e-06 1 1 1 1 1 ' 1 1 1 1 ' 1 1 1 1 1 

0.5 1 1.5 

Exponent p 

Figure 2: Error of adaptive time step integration depending on exponent choice 
in Sundman transformation. Missong points in the figure indicate that the re- 
spective simulation runs broke down, e.g. because of negative x values pro- 
duced during integration. 

We see a discrepancy compared to what Ifl2ll considers as 
optimal in this so-called hard soft-wall problem and note that 
in |[T2l not only Coulomb potentials but a whole class of l/x" 
potentials are studied. If we insert a — 1 for our case, the 
choice /3 — 3/2 is proposed. Their findings are based on a scale 
invariance argument and they verify the result with experiments 
for larger a than in our case. It could be that for the specific case 
a = 1, the general result is too approximative. In our opinion, 
authors previously using /3 — 3/2 should review this choice. 

6. Applications 

The presented algorithms were implemented and tested in 
the OPAL accelerator simulation framework fT). In the first 
experiment, we compare AMTS to MTS, and in the second ex- 
periment, we compare MTS to the existing Boris-Buneman in- 
tegrator. All simulations were performed with small resolution 
(few thousand particles and small PIC mesh resolution) on a 



laptop computer. But it can be expected that the results carry 
over to large-scale simulations, where of course the absolute 
gain in computation time is higher. 

6.1. Photoinjector 

We simulate a beam for the first half meter in a photoin- 
jector. This is an example in which space charge forces are 
important because of the low beam energy, and where the mag- 
ntiude of the space charge forces changes considerably during 
the simulation. See Fig. [3] for the energy curve, and Fig. [4] for 
the development of the beam size. 




8e-10 1.2e-09 1.6e-09 



Simulation time [s] 



Figure 3: Mean particle energy in the photoinjector scenario. 




Simulation time [s] 



Figure 4: Transversal and longitudinal root-mean-square size of the beam for 
the photoinjector. 

We show that the adaptive step size variant of the multiple- 
time-stepping integrator (AMTS) is more efficient than the con- 
stant MTS integrator. To allow a comparison of the differ- 
ent integration strategies with respect to space charge effects, 
we want to treat external field integration as much as possible 
equally among all runs. In MTS the inner time step is fixed to 
A/inner = 5 • 1CT 13 , and the outer step size is a multiple of it. In 
AMTS, as the outer time step increases, more substeps are used 
to maintain an inner time step around Afi nner . See Fig.[5]for the 
time step choice of an adaptive run. The strong variation in the 
space charge forces leads to a change in the (outer) time step of 
more than a factor 100. 

In Fig. [6] we give a comparison of the error made in trans- 
verse root-mean-square emittance (a measure of the phase space 
volume) for different amount of work spent on self field calcu- 
lations. The errors are with respect to a reference solution ob- 
tained using the MTS integrator with m = 1 and a step size of 
5 • 10~ 13 . AMTS reaches a given error with drastically fewer 
self field calculations. 
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Figure 5: Time step choice of an AMTS run in the photoinjector simulation. 
Because the method requires an integer number of substeps per outer step, the 
inner time step slightly fluctuates around the desired value. 
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Figure 6: Time integration error in the transverse root-mean-square emittance 
for the photoinjector simulation. Errors are with respect to a reference solution 
obtained with MTS using 3400 self field calculations. 

In Fig. [7] we compare different choices of the g-function, 
which determines how the variable time step is chosen. Al- 
though all choices are far better than a constant step, we see 
that our choice with /3 = 1.0, derived with the ID model, gives 
slightly better results than the other considered possibilities. It 
is also better than the choice from experiments reported in 0, 
which used a much simpler adaptation criterion, namely the 
beam size. 

Note that the actual time spent on these simulations is dom- 
inated by the external field integration, because the inner time 
step is kept very small. The reason why we choose the inner 
time step unpractically small is that the work spent for external 
field integration should be kept constant among the runs, such 
that the comparison using the number of self field calculations 
makes sense. For practical simulations, the inner time step can 
be chosen larger hence for the same accuracy, AMTS yields a 
significant reduction in time-to-solution compared to the other 



Figure 7: Comparison of different choices of the ^-function for the AMTS in- 
tegrator. 

integrators. 
6.2. Cyclotron 

The simulation library OPAL [1| has an own component 
for the simulation of cyclotrons. This component was recently 
enhanced with support for neighboring bunch effects fOl . in- 
creasing significantly the number of space charge calculations 
in a simulation. For this comparison, we simulate the first 10 
turns of the PSI 590 MeV ring cyclotron, with an initial particle 
energy of 72 MeV. First, we had to recognize that a variable step 
size (AMTS) brings no visible benefit over the plain MTS algo- 
rithm. This is understandable as the variation in space charge 
forces is far smaller than in the photoinjector case. Also, the 
influence of space charge is smaller because the beam energy 
is higher. Therefore, we use this case to report about the differ- 
ence between the original integrator and the MTS variant. 

We now want to observe how close a run with reduced space 
charge solve frequency (MTS with m > 1) comes to the origi- 
nal solution, and how much time we save. As measure, we take 
again transversal RMS emittance. Relative errors are calculated 
at five points per turn, and then the maximum of these errors 
counts as overall error. The timings are not meant to be a pre- 
cise profiling, but should give a hint how much time is roughly 
saved. In general, the savings depend mainly on the fraction of 
space charge solve cost to overall cost of the simulation, beside 
further details in the implementation. See Table [T] for the re- 
sults. Space charge has certainly an influence in this scenario, 
as the error of the solution without simulating space charge is 
not small. The surprising fact is that a drastically reduced space 
charge solve frequency is still very accurate, but saves a lot of 
time. 

The existing Boris-Buneman implementation in OPAL (with- 
out MTS extension) had an option to calculate the space charge 
forces only every n-th step, and then reusing the old forces in 
the next n - 1 time steps. This was a first step towards reducing 
computation cost. In this context it is interesting to see how this 
first approach compares to the new MTS method. In Table|2]we 
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SC solve frequency 


Error in % 


CPU time (seconds) 


m = 1 





524 


m = 1 


3.35 ■ 10~ 6 


357 


m = 4 


1.52- 10~ 5 


273 


m = 10 


1.15- 10~ 4 


222 


m = 20 


5.13- 10-* 


207 


m = 100 


8.89 ■ 10~ 2 


193 


No space charge 


12.0 


190 



Table 1 : Maximal relative error in transversal RMS emittance and timings of 
MTS integrator in the ring cyclotron simulation. Increasing the MTS substep 
number m (see (lOj) reduces the number of self field computations, thus saving 
time while introducing some deviation to the solution with full solve frequency 
(m = 1). The solution with no space charge effects modeled is given as com- 
parison. 

give errors and timings for this older approach. Again, as the 
timings depend on implementation details, only the big picture 
is important here. It is apparent that this, without prejustice 
seemingly reasonable scheme (reuse something that changes 
slowly), builds up a large error even with low values of n, and 
was therefore rarely used. It seems that reusing of old forces in- 
troduces an asymmetry which hurts the accuracy, and a proper 
MTS implementation should be used instead. 



SC solve frequency 


Error in % 


CPU time (seconds) 


n = 1 





473 


n = 2 


1.01 ■ 10" 1 


344 


n = 4 


2.62 ■ 10"' 


281 


n = 10 


1.45 


242 


n = 20 


5.46 


230 


n = 100 


15.4 


220 


No space charge 


12.0 


197 



Table 2: Maximal relative error in transversal RMS emittance and timings of 
modified Boris-Buneman integrator in the ring cyclotron simulation. Calculat- 
ing the self field only every n-th step reduces the number of self field compu- 
tations, thus saving time while introducing some deviation to the solution with 
full solve frequency (n = 1). The solution with no space charge effects modeled 
is given as comparison. 



7. Conclusions 

We presented two time integration schemes that enhance the 
standard Boris-Buneman algotitm by adapting the frequency of 
self field calculations. The usability was demonstrated within 
the OPAL particle accelerator framework, but the methods could 
be easily applied to similar problems. 

The multiple time stepping (MTS) extension allows to com- 
pute the self fields less frequently, by a factor which has to be 
defined beforehand. For many scenarios where space charge 
has a visible contribution but is not dominant, this method can 
save considerable computation time with only negligibly chang- 
ing the solution. While the multiple time stepping strategy is 
not new, we have not seen it applied in this context. Initially in- 
tended to be an intermediate algorithm only to derive the vari- 
able step size variant, MTS itself turned out to be of practi- 



cal relevance, as its implementation incurs hardly any overhead 
while the performance gain is substantial. 

The adaptive multiple time stepping (AMTS) algorithm in- 
troduces a variable step size integrator. Variable step sizes are 
most beneficial over MTS in space charge dominated (low en- 
ergy) simulations. If space charge forces change considerably, 
like in gun simulations, AMTS shows its strengths. While pre- 
vious work indicated that variable step sizes can be useful in 
such cases, this work gives further insight on how the step size 
should be adapted for good results. The foundation on MTS al- 
lows to have two individual time steps, an outer one that can be 
adapted to self field situation, and an inner one that can be kept 
small for external fields. The implementation is more compli- 
cated than for MTS. However, we have shown that important 
problems exist for which this additional effort pays off. 
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